About the Size of Boyer-Moore Automata

Authors

  • Olivier Delgrange
  • Rodrigo Scheihing
Abstract

We study the size of Boyer-Moore automata, introduced in Knuth, Morris & Pratt's famous paper on pattern matching. We experimentally exhibit a finite class of binary patterns that produce large Boyer-Moore automata. The best approximation curve for their sizes is a polynomial O(m^7), or even an exponential O(2^(0.4m)), in the length m of the patterns. All previously known maximal sizes were at most cubic in m. Our results suggest studying two particular infinite classes of patterns, for which we conjecture that the generated automata have size Ω(m^5).
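To make the notion of automaton size concrete, here is a minimal Python sketch that counts the reachable states for a given pattern. It rests on one assumed formulation, which may differ from the exact definition used by the authors: a state is the set of window positions whose text characters are known to match the pattern, the next comparison is made at the rightmost unknown position, and after a mismatch or a full match the window moves by the smallest shift compatible with every known text character. The function name `bm_automaton_size` and its parameters are hypothetical, chosen for illustration only, and the counts it produces are not claimed to reproduce the paper's figures.

```python
from collections import deque


def bm_automaton_size(pattern, alphabet=None):
    """Count reachable states of a Boyer-Moore-style automaton.

    Assumed formulation (not necessarily the paper's): a state is the
    frozenset of window positions known to match the pattern.
    """
    m = len(pattern)
    # Assumption: the alphabet defaults to the characters of the pattern.
    alphabet = sorted(set(alphabet or pattern))

    def shift(known):
        # known maps window position -> text character seen there.
        # Try shifts 1..m and keep the first one under which every known
        # character that stays in the window agrees with the pattern.
        for s in range(1, m + 1):
            if all(pattern[j - s] == c for j, c in known.items() if j >= s):
                return frozenset(j - s for j in known if j >= s)

    start = frozenset()
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        # Rightmost window position whose text character is still unknown.
        j = max(i for i in range(m) if i not in state)
        known = {i: pattern[i] for i in state}
        for c in alphabet:
            known_c = dict(known)
            known_c[j] = c
            if c == pattern[j] and len(state) + 1 < m:
                nxt = state | {j}        # match, window not yet complete
            else:
                nxt = shift(known_c)     # mismatch or full occurrence
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for p in ["ab", "aab", "abab", "aabab"]:
        print(p, bm_automaton_size(p, "ab"))
```

Because states are subsets of pattern positions, the count is trivially bounded by 2^m; the question the abstract addresses is how much of that space is actually reachable for specific patterns.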


Similar resources

Exact Analysis of Pattern Matching Algorithms with Probabilistic Arithmetic Automata

We propose a framework for the exact probabilistic analysis of window-based pattern matching algorithms, such as Boyer-Moore, Horspool, Backward DAWG Matching, Backward Oracle Matching, and more. In particular, we show how to efficiently obtain the distribution of such an algorithm’s running time cost for any given pattern in a random text model, which can be quite general, from simple uniform ...


Representing Pattern Matching Algorithms by Polynomial-Size Automata

Pattern matching algorithms to find exact occurrences of a pattern S ∈ Σ^m in a text T ∈ Σ^n have been analyzed extensively with respect to asymptotic best, worst, and average case runtime. For more detailed analyses, the number of text character accesses X_n performed by an algorithm A when searching a random text of length n for a fixed pattern S has been considered. Constructing a state space an...


An Algorithm to Compute the Character Access Count Distribution for Pattern Matching Algorithms

We propose a framework for the exact probabilistic analysis of window-based pattern matching algorithms, such as Boyer–Moore, Horspool, Backward DAWG Matching, Backward Oracle Matching, and more. In particular, we develop an algorithm that efficiently computes the distribution of a pattern matching algorithm’s running time cost (such as the number of text character accesses) for any given patte...


A family of fast exact pattern matching algorithms

A family of comparison-based exact pattern matching algorithms is described. They use multi-dimensional arrays to process more than one adjacent text window in each iteration of the search cycle. This approach lowers the average time complexity at the cost of space. The algorithms of this family perform well for short patterns and medium-sized alphabets. In such case the shift...


A Mechanically Checked Proof of the Correctness of the Boyer-Moore Fast String Searching Algorithm

We describe a mechanically checked proof that the Boyer-Moore fast string searching algorithm is correct. This is done by expressing both the fast algorithm and the naïve (obviously correct) algorithm as functions in applicative Common Lisp and proving them equivalent with the ACL2 theorem prover. The algorithm verified differs from the original Boyer-Moore algorithm in one key way: the origina...




Publication date: 2007